Development of Novel KASP Markers for Improved Germination in Deep-Sown Direct Seeded Rice

Background The lack of stable, high-yielding varieties adapted to direct seeding with better germination ability from deeper soil depths, together with the limited availability of molecular markers, is a major limitation in achieving the maximum yield potential of rice under water- and resource-limited conditions. The development of high-throughput, trait-linked markers is of great interest in genomics-assisted breeding. The aim of the present study was to develop and validate novel KASP (Kompetitive Allele-Specific PCR) markers associated with traits improving germination and seedling vigor of deep-sown direct seeded rice (DSR). Results Out of 58 designed KASP assays, four did not show any polymorphism in any of the eleven genetic backgrounds considered in the present study. The 54 polymorphic KASP assays were then validated for their robustness and reliability on the F1 plants developed from the eight different crosses considered in the present study. The third validation was carried out on 256 F3:F4 and 713 BC3F2:3 progenies. Finally, the reliability of the KASP assays was assessed on a random set of 50 samples from the F3:F4 and 80–100 samples from the BC3F2:3 progenies using 10 random markers. From the 54 polymorphic KASP assays, a total of 12 were selected based on the false positive rate, false negative rate, KASP utility in different genetic backgrounds and significant differences in the phenotypic values of the positive (desirable) and negative (undesirable) traits. These 12 KASP assays include 5 on chromosome 3, 1 on chromosome 4, 3 on chromosome 7 and 3 on chromosome 8. The two SNPs lying in the exon regions of LOC_Os04g34290 and LOC_Os08g32100 led to non-synonymous mutations, indicating a possible deleterious effect of the SNP variants on the protein structure.
Conclusion The present research work provides trait-linked KASP assays, improved breeding material possessing favourable alleles, and breeding material in the form of expected pre-direct-seeded adapted rice varieties. The markers can be utilized in introgression programs during the pyramiding of valuable QTLs/genes providing adaptation to rice under DSR. Functional studies of the genes LOC_Os04g34290 and LOC_Os08g32100, which carry the two validated SNPs, may provide valuable information about these genes. Supplementary Information The online version contains supplementary material available at 10.1186/s12284-024-00711-1.


Background
Rice (Oryza sativa) is one of the world's major food crops. Considering the shortage of water and labor and the advances in agricultural mechanization, direct seeded rice (DSR) appears as an alternative method of rice cultivation. Poor seed germination, serious weed infestation, and low seedling vigor are among the major problems leading to substantial yield loss in the DSR cultivation system. The success of direct seeding relies strongly on the development of rice varieties with robust crop establishment (Kumar and Ladha 2011; Mahender et al. 2015). Broadcasting or surface seeding of rice may lead to poor establishment and an uneven crop stand due to predation, drought, rain splashing, a greater vapour pressure gradient, and high temperature (Kumar and Ladha 2011; Yamauchi and Winn 1996). Deep sowing is an effective alternative that ensures the seeds are fully protected, less vulnerable to pests, and able to access the available moisture at greater soil depths. However, the poor seedling emergence and establishment and the low dry matter accumulation (Loeppky et al. 1989) caused by deep sowing greatly restrict deep-sown DSR technology. Mesocotyl length, seedling emergence and seedling establishment are three important traits determining high rice yields in deep-sown DSR systems (Lee et al. 2017; Turner et al. 1982; Wu et al. 2015; Lu et al. 2016). Mesocotyl elongation is affected by various factors including light, water, soil depth and temperature. Plant hormones such as brassinosteroid (BR), abscisic acid (ABA), cytokinin (CTK), strigolactones (SLs), ethylene (ETH), gibberellin (GA), indole-3-acetic acid (IAA), and jasmonic acid (JA) play an important role in regulating mesocotyl elongation. Earlier reports (Mahender et al. 2015; Turner et al. 1982; Dilday et al. 1990) suggested that the emergence rate of drill-seeded semi-dwarf rice genotypes is much lower and less uniform than that of non-dwarf genotypes with a long mesocotyl. For successful crop establishment under deep-sown DSR, DSR-adapted rice varieties should have higher germination, faster seedling emergence with more vigorous growth, and a longer mesocotyl.
Quantitative trait loci (QTL) mapping facilitates the identification of targeted genomic regions associated with favorable traits (Collard and Mackill 2008), and gene pyramiding involves the simultaneous incorporation of multiple genes governing various traits into a single plant, resulting in varieties with a comprehensive array of desirable characteristics (Xu et al. 2017). Marker development, which enables breeders to make accurate selections during the breeding process, has emerged as a powerful strategy (Collard and Mackill 2008).
Advances in crop genome sequencing over the past few decades have had a huge impact on our ability to develop novel SNP (single nucleotide polymorphism)-based molecular markers (Przewieslik-Allen et al. 2019), which have largely replaced SSRs (simple sequence repeats) in cereal crop species (Semagn et al. 2014). The development of novel markers enables breeders to precisely identify and select breeding lines/germplasm possessing desired traits. Ideal high-throughput DNA markers possess essential features such as co-dominant inheritance, high genomic abundance and polymorphism, a low error rate, dense distribution, and seamless automation. The knock-on effect of the widespread adoption of novel molecular markers has been seen in the development of genomics-assisted breeding lines.
SNP-based markers have emerged as a powerful tool in various genetic applications including germplasm characterization and quality assessment, linkage mapping, association mapping, allele mining, marker-assisted selection and backcrossing, and genomic selection (Rafalski 2002; Schlotterer 2004; Semagn et al. 2014). The high-throughput SNP genotyping platform Kompetitive Allele-Specific PCR (KASP) has evolved as a global benchmark technology. KASP markers are widely used for genetic mapping and trait-specific marker development due to their low cost, low genotyping error rates, high reliability, and reproducibility (He et al. 2014; Ertiro et al. 2015; Rasheed et al. 2016; Tan et al. 2017).
A novel core set of 110 KASP markers associated with traits improving grain yield and adaptability under DSR cultivation conditions was developed and validated (Sandhu et al. 2022). These KASP markers are now routinely used in genomics-assisted breeding programs for characterizing breeding material with respect to important QTLs/genes affecting grain yield, adaptability, and biotic/abiotic stress tolerance/resistance under DSR cultivation conditions. A total of 71,311 KASP SNP markers with an average density of 34 KASP/Mb have been developed from RNA-Seq data for map-based cloning and marker-assisted selection in maize (Chen et al. 2021). High-density SNP arrays are available for crop species including rice (Yu et al. 2014; Thomson et al. 2014; Chen et al. 2014; Singh et al. 2015), wheat (Allen et al. 2017), barley (Bayer et al. 2017), potato (Vos et al. 2015) and apple (Bianco et al. 2014, 2016). Considering the importance of KASP markers in genomics-assisted breeding, the objective of the present research was to develop and validate SNP/allele-specific trait-linked markers that target the genomic regions associated with improved germination and seedling vigour under deep-sown direct seeded rice cultivation conditions. To the best of our knowledge, this is the first study targeting the development of a trait-based SNP panel for traits improving seedling vigor of rice under DSR. A set of core SNPs will be built by targeting variations in the previously identified genomic regions associated with DSR traits.

Materials and Methods
The present study was carried out at the School of Agricultural Biotechnology, Punjab Agricultural University, Ludhiana, Punjab, India. To understand the genetic control of rice seedling vigour under DSR, genome-wide association studies for multiple seedling traits in 684 accessions from the 3000 Rice Genomes (3 K-RG) population were carried out in both the laboratory and the field at three planting depths (4, 8 and 10 cm) (Menard et al. 2021). The best donors with favourable alleles and significant marker-trait associations (MTAs)/QTLs for mesocotyl length, percentage seedling emergence and shoot biomass in this panel were identified (Menard et al. 2021).

Phenotypic Evaluation of Parental Genotypes and Breeding Material
The five donors (Aus344, N22, Kula Karuppan, NCS237, and IRGC 128442), the five recipients (PR126, PR121, PR128, PR129, and PB1509), and a set of 256 F3:F4 plants and 713 BC3F2:3 plants from the abovementioned mapping populations were screened for emergence from deeper soil depth in Kharif 2022 and 2023. The experiment followed a completely randomized design (CRD) with three replications. Six seeds from each entry were sown in plastic trays filled with soil at 4 cm and 10 cm depth. In each tray, the donor possessing the longer mesocotyl was kept as a positive check, and PR126 and PB1509 were kept as negative checks. The seed sown at 4 cm depth was considered the control for the 10 cm soil depth. Data on percent germination at the different soil depths, days to germination, mesocotyl length, and root and shoot length were recorded. Percent germination at each sowing depth was recorded as (total number of seeds emerged from the soil/total number of seeds planted) × 100, days to germination was recorded in days, mesocotyl length was measured with a vernier calliper in mm, and root and shoot length were measured with a centimetre scale.

Statistical Analysis
The means were calculated from the replicated observations. The means were used to draw frequency curves showing the phenotypic distribution of the traits. The data were pooled across both years. The analysis of variance (ANOVA) for the completely randomized design (CRD) was calculated in STAR (Statistical Tool for Agricultural Research) version 2.0.1. The ANOVA model for CRD was as follows:
$$Y_{ij} = \mu + \alpha_i + e_{ij}$$
where $Y_{ij}$ = performance of the ith treatment (genotype) in the jth replication, $\mu$ = general mean, $\alpha_i$ = effect of the ith treatment, and $e_{ij}$ = error effect.

Whole Genome Resequencing
The genomic DNA of the five donors (Aus344, N22, Kula Karuppan, NCS237, and IRGC 128442) and the five recipient backgrounds (PR126, PR121, PR128, PR129, PB1509) was isolated using the modified CTAB method. DNA quality was examined using gel electrophoresis. High-throughput whole genome resequencing was carried out at NGB Diagnostics, New Delhi, using the Illumina HiSeq 4000. Genomic DNA (gDNA) libraries were prepared following the Illumina TruSeq protocol v3, and sequencing produced 150 bp paired-end short reads in FASTQ format. The initial output yielded a total of 4 Gb of raw sequence data. To refine these data, the following steps were employed in the processing pipeline. The entire procedure used for creating the core trait-linked KASP marker panel, integral to future genomics-assisted breeding initiatives, includes sequencing, read processing, read alignment, variant calling and the design of KASP markers.
Sequencing, read processing, alignments, and variant calling: Paired-end sequencing was executed on the Illumina HiSeq 4000 platform at NGB Diagnostics Private Limited, New Delhi, India. Read processing followed. For the subsequent bioinformatics analysis, the initial step comprised a quality check and the removal of Illumina adaptor sequences. Quality trimming was then applied to the reads, including the removal of adaptor-clipped reads containing Ns. Moreover, 3′-end trimming was performed to achieve a minimum average Phred quality score of 20 across a ten-base window. Any reads shorter than 20 bases after trimming were excluded from further analysis.
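The 3′-end trimming rule described above can be illustrated with a minimal Python sketch. It assumes Phred+33 encoded quality strings and a simple stopping rule (bases are dropped from the 3′ end until the last ten-base window averages at least Q20). The trimming tool actually used is not named in the text, so the function name `trim_3prime` and this exact rule are illustrative assumptions rather than the authors' implementation.

```python
# Hypothetical sketch of window-based 3'-end quality trimming (Phred+33 assumed).
WINDOW = 10      # window size in bases, as stated in the text
MIN_MEAN_Q = 20  # minimum average Phred score, as stated in the text
MIN_LEN = 20     # reads shorter than this after trimming are discarded


def phred_scores(qual):
    """Convert a Phred+33 quality string into a list of integer scores."""
    return [ord(ch) - 33 for ch in qual]


def trim_3prime(seq, qual):
    """Trim the 3' end until the last WINDOW bases average >= MIN_MEAN_Q.

    Returns (seq, qual) after trimming, or None when the read contains an N
    or ends up shorter than MIN_LEN (assumed discard rules).
    """
    if "N" in seq.upper():
        return None
    scores = phred_scores(qual)
    end = len(scores)
    while end >= WINDOW:
        window = scores[end - WINDOW:end]
        if sum(window) / WINDOW >= MIN_MEAN_Q:
            break
        end -= 1  # drop one base from the 3' end and re-check
    if end < MIN_LEN:
        return None
    return seq[:end], qual[:end]


if __name__ == "__main__":
    # Toy read: 22 high-quality bases (Q40) followed by 8 low-quality bases (Q2).
    read = trim_3prime("A" * 30, "I" * 22 + "#" * 8)
    print(len(read[0]) if read else "discarded")
```

A windowed average tolerates a few low-quality bases at the read end, which is why the toy read keeps some of its low-quality tail.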
The O. sativa (version 7.0) reference sequence sourced from RGAP (Rice Genome Annotation Project, http://rice.plantbiology.msu.edu/pub/data/Eukaryotic_Projects/o_sativa/annotation_dbs/pseudomolecules/version_7.0/all.dir/) was used for mapping. The sequencing reads were mapped against the reference genome using BWA (version 0.7.17-r1188) with default alignment parameters. Subsequent analyses included only those read pairs in which both reads aligned as expected.
The SAM alignment files were converted into binary BAM format using SAMtools (version 0.1.19) (Li et al. 2009). Picard (version 1.48) was used to detect and mark duplicate entries in the sorted BAM files. The resulting BAM file served as input for the final variant calling using UnifiedGenotyper in the GATK pipeline (Genome Analysis Toolkit, version 3.6). To facilitate comparative analysis and the identification of unique SNPs within the donor parent variant files across all samples, BCFtools (version 1.9) was used to merge the variant files. Variants with a minor allele frequency (MAF) exceeding 2% and with data available for at least 80% of the samples were retained. The final step of variant calling involved filtering, accomplished using VCFtools (version 0.1.17) (Danecek et al. 2011).
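The MAF and missing-data thresholds above can be made concrete with a small Python sketch. It assumes biallelic sites, genotypes coded as allele pairs, and `None` for a missing call; the function names are hypothetical. In VCFtools, thresholds of this kind are usually set with the `--maf` and `--max-missing` options, although the exact command used in the study is not stated.

```python
# Illustrative site filter: keep a variant if MAF > 0.02 and call rate >= 0.80.
# Assumptions: biallelic sites; genotypes are allele pairs such as ("A", "G");
# None marks a missing genotype call.
from collections import Counter

MAF_MIN = 0.02        # minor allele frequency threshold from the text
CALL_RATE_MIN = 0.80  # at least 80% of samples must have a call


def call_rate(genotypes):
    """Fraction of samples with a non-missing genotype."""
    called = [g for g in genotypes if g is not None]
    return len(called) / len(genotypes)


def minor_allele_frequency(genotypes):
    """Frequency of the less common allele among called genotypes."""
    counts = Counter(a for g in genotypes if g is not None for a in g)
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return 0.0  # monomorphic or entirely missing site
    return min(counts.values()) / total


def passes_filter(genotypes):
    """Apply both thresholds to one variant site."""
    return (call_rate(genotypes) >= CALL_RATE_MIN
            and minor_allele_frequency(genotypes) > MAF_MIN)


if __name__ == "__main__":
    # Toy site across 11 samples: 8 A/A, 2 G/G, 1 missing.
    site = [("A", "A")] * 8 + [("G", "G")] * 2 + [None]
    print(call_rate(site), minor_allele_frequency(site), passes_filter(site))
```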

Designing of KASP Markers
The KASP markers were designed using the offline PolyMarker software (Ramirez-Gonzalez et al. 2015), with MAFFT, Exonerate, Primer3, SAMtools, BAMtools, Bio-samtools, BLAST, and GLib 2.0 available on the system path. A reference genome database was built with BLAST, and the reference genome was then indexed with SAMtools to generate a dedicated index file for the genome.
Over the past ten years, efforts have been made at IRRI, Philippines and PAU, Ludhiana to identify donors and genomic regions associated with traits improving seedling vigor of rice under direct-seeded cultivation conditions (Menard et al. 2021; Sandhu et al. 2023). [...] on chromosome 8 showed association with % germination and mesocotyl length (Menard et al. 2021). Earlier, Redoña and Mackill (1996) reported genomic regions from 16.84 to 29.58 Mb on chromosome 3 that showed association with mesocotyl and coleoptile length in rice.
Variant calls for the specific gene/QTL regions were extracted from the VCF files produced by SNP calling. The flanking regions surrounding the SNPs were extracted from the reference genome using bedtools. The extracted SNP regions were then assembled into the input format required by PolyMarker: a CSV (comma-separated values) file containing the ID, chromosome number, and variant call together with 100-bp flanking regions on each side. These files were subsequently used as input for PolyMarker.
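As a rough illustration of this formatting step, the sketch below builds one input row from a reference sequence and a SNP position. It assumes the commonly described PolyMarker layout of `ID,chromosome,sequence`, with the SNP written in brackets (for example `[A/G]`) between its flanks. The helper name `polymarker_row`, the marker ID and the toy sequence are hypothetical. In the study the flanks were extracted with bedtools rather than by string slicing.

```python
# Hypothetical helper that formats one SNP as "ID,chromosome,left[REF/ALT]right".
# The bracketed-SNP layout is an assumption about the PolyMarker input format.

def polymarker_row(marker_id, chrom, ref_seq, pos, ref, alt, flank=100):
    """Build one CSV row; `pos` is the 1-based SNP position on `ref_seq`."""
    idx = pos - 1
    if ref_seq[idx].upper() != ref.upper():
        raise ValueError("reference allele does not match the sequence")
    left = ref_seq[max(0, idx - flank):idx]       # up to `flank` bases upstream
    right = ref_seq[idx + 1:idx + 1 + flank]      # up to `flank` bases downstream
    return f"{marker_id},{chrom},{left}[{ref}/{alt}]{right}"


if __name__ == "__main__":
    # Toy 10-bp "chromosome" with a short flank so the output stays readable.
    print(polymarker_row("K_demo", "Chr3", "ACGTACGTAC", 5, "A", "G", flank=3))
```

Checking the reference allele against the sequence before writing the row is a cheap guard against off-by-one errors between 0-based and 1-based coordinates.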

Filtering and Selection of KASP Markers
The markers located in the previously identified genomic regions associated with traits improving seedling vigor of rice under DSR were screened. All the gene files for the reference genome were retrieved from RGAP. The selected markers were then screened for donor specificity using the merged variant file created with BCFtools. All the shortlisted markers were aligned to the reference genome using BLAST, and markers aligning at multiple loci were rejected. Only high-specificity markers aligning at the desired locus with a low e-value were selected.
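The specificity screen described above can be sketched as a filter over BLAST tabular output. The example assumes the standard 12-column tabular format (`-outfmt 6`: qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore). The e-value cutoff, marker IDs and function name are hypothetical, since the text only specifies a "low e-value".

```python
# Sketch: keep markers whose sequence hits exactly one locus below an
# e-value cutoff. The cutoff is hypothetical (the text only says "low e-value").
from collections import defaultdict

EVALUE_MAX = 1e-20


def unique_locus_markers(blast_lines, evalue_max=EVALUE_MAX):
    """Return sorted marker IDs with exactly one qualifying hit."""
    hits = defaultdict(list)
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank and comment lines
        fields = line.split("\t")
        qseqid, sseqid = fields[0], fields[1]
        sstart, evalue = int(fields[8]), float(fields[10])
        if evalue <= evalue_max:
            hits[qseqid].append((sseqid, sstart))
    return sorted(m for m, loci in hits.items() if len(loci) == 1)


if __name__ == "__main__":
    demo = [
        "m1\tChr3\t100.0\t201\t0\t0\t1\t201\t5000\t5200\t1e-100\t372",
        "m2\tChr7\t100.0\t201\t0\t0\t1\t201\t900\t1100\t1e-100\t372",
        "m2\tChr8\t99.0\t201\t2\t0\t1\t201\t7000\t7200\t1e-90\t350",
        "m3\tChr4\t85.0\t40\t6\t0\t1\t40\t300\t339\t0.5\t30",
    ]
    print(unique_locus_markers(demo))
```

In this toy input, `m2` is rejected for aligning at two loci and `m3` for having only a weak hit, so only `m1` is kept.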

Results
From the high-quality sequences, the total number of variants at 10× was 28,938,981, with an average of 2,630,816 variants per sample (Additional file 1: Table S2). The number of variants at 10× was highest in Pusa Basmati 1509 (3,401,748) and lowest in Kula Karuppan (184,335). The genome sequences of all 5 donors were compared with each of the six recipient backgrounds for the identification of SNPs. The designed KASP assays were very informative for the rice germplasm comprising 5 donors and 6 recipient backgrounds. Out of 58 designed KASP assays, four (K_10517640, K_21414536, K_21515581, and K_19899355) did not show any polymorphism for any of the recipient backgrounds used in the present study. A few examples of KASP assays on the 10 parental genotypes are presented in Fig. 2. Of the 54 polymorphic KASP assays, 45 were localized within MSUv7 gene models (http://rice.plantbiology.msu.edu), and 9 were located within intergenic regions (Additional file 1: Table S3). These 9 KASP markers were located in the intergenic region of chromosome 7 (Additional file 1: Table S3). The highest quality SNPs were detected for the KASPs K_33070058 [...] IRGSP1.0 (International Rice Genome Sequencing Project) (Additional file 1: Table S3).
The average physical distance between two polymorphic KASP markers on chromosome 3 was 505 kb (~2.070 cM) for qSD3.1 and 13.16 kb (~0.0539 cM) for qSD3.2, considering 1 cM equal to ~244 kb (Chen et al. 2002). The average physical distance between two polymorphic KASP markers in the genomic region associated with qSD4.1 on chromosome 4 was 35.17 kb (~0.144 cM). Further, this average distance was 228.86 kb (~0.978 cM), 293.43 kb (~1.203 cM) and 30.17 kb (~0.124 cM) for qSD7.1, qSD7.2 and qSD8.1, respectively.
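The physical-to-genetic conversion used above is a simple ratio, shown here as a short Python sketch. It assumes the genome-wide average of ~244 kb per cM quoted in the text. The function name is hypothetical, and a single average is only a rough approximation because recombination rate varies along the chromosomes.

```python
# Convert physical distance (kb) to approximate genetic distance (cM),
# assuming the average of ~244 kb per cM quoted in the text.
KB_PER_CM = 244.0


def kb_to_cm(distance_kb, kb_per_cm=KB_PER_CM):
    """Approximate genetic distance in cM for a physical distance in kb."""
    return distance_kb / kb_per_cm


if __name__ == "__main__":
    for qtl, kb in [("qSD3.1", 505.0), ("qSD3.2", 13.16), ("qSD4.1", 35.17)]:
        print(f"{qtl}: {kb} kb ~ {kb_to_cm(kb):.4f} cM")
```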

Quality Parameters of KASP Markers
The quality of each of the 54 polymorphic KASP markers for the traits associated with seedling vigor was assessed based on parameters such as utility, false positive rate (FPR) and false negative rate (FNR) (Table 1). Quality was assessed on the 256 F3:F4 plants from seven populations and the 713 BC3F2:3 plants derived from the eight backcross populations. The utility of the KASP markers ranged from all six recipient backgrounds to only one or two recipient backgrounds. The allelic effects of all the polymorphic KASP assays on the phenotypes of the F3:F4 and BC3F2:3 derived populations are described for the % germination and mesocotyl length traits in Table 3. The FPR and FNR of the KASP assays in the F3:F4 mapping populations ranged from 0.0114 to 0.1316 and from 0.0 to 0.0952, respectively (Table 1), while in the BC3F2:3 populations they ranged from 0.0643 to 0.25 and from 0.0055 to 0.1181, respectively (Table 3). The utility of the KASP assays varied from all five genetic backgrounds to one background only. A total of 13 KASP assays showed utility in five recipient backgrounds, 15 in any four, 5 in any three, 8 in any two and 13 in any one recipient background. Of the 13 KASP assays that showed utility in all 5 genetic backgrounds, 8 were on chromosome 3, three on chromosome 7 and two on chromosome 8. Of the 15 KASP assays that were polymorphic for four genetic backgrounds, three belonged to chromosome 3, one to chromosome 4, seven to chromosome 7 and four to chromosome 8. Further, the 5 KASP assays that showed polymorphism with three recipient backgrounds were on chromosome 3 (1 KASP), chromosome 4 (3 KASP) and chromosome 7 (1 KASP). Detailed information on each KASP marker, including its utility in each of the six recipient backgrounds, allelic interpretation, FPR and FNR, is presented in Table 3.

Genetic Relationship
The genetic relationship among the eleven genotypes, including 5 donors and 6 recipient backgrounds, was studied using genetic diversity analysis and Principal Component Analysis (PCA). The UPGMA (unweighted pair group method with arithmetic mean) cluster analysis divided the 11 rice genotypes into two major groups (Fig. 3). All the recipients except MTU1010 were present in Group I. The donors, along with the upland-adapted genotype MTU1010, constituted Group II, which was further divided into two subgroups. Subgroup I contained MTU1010, whereas subgroup II contained the five donor backgrounds. The Aus genotypes Aus344, N22 and IRGC 128442 were present in one subgroup, whereas the indica genotypes Kula Karuppan and NCS237 were present in another. Similarly, the recipients PR128 and PR129 were present in one subgroup and PR121, PR126 and Pusa Basmati 1509 in another.
Table footnote. Chr: chromosome; bp: base pair; Ref allele: allele present in the reference genome; positive allele: allele present in the donor parent; FPR: false positive rate; FNR: false negative rate; frequency (%) negative trait: number (percent of the total) of the breeding lines possessing the recipient parent allele; frequency (%) positive trait: number (percent of the total) of the breeding lines possessing the donor parent allele; phenotypic mean negative trait: mean value of the breeding lines possessing the recipient parent allele; phenotypic mean positive trait: mean value of the breeding lines possessing the donor parent allele; KASP utility: the percentage of prospective backgrounds across which the SNP marker could be used to introgress the positive allele associated with the trait of interest; False Positive Rate (FPR): the proportion of breeding lines with recipient allele but identified as not having an unfavorable/recipient allele of the SNP marker, calculated as the number of breeding lines without recipient allele/total number of breeding lines with recipient allele; False Negative Rate (FNR): the proportion of breeding lines with donor allele but identified as not having the desired QTL/donor allele, calculated as the number of breeding lines without favorable allele/total number of breeding lines with donor allele. The significance level indicates the allelic effects of the KASP assays on the mean phenotypic values of the NILs and RILs estimated using the Kruskal-Wallis test. *Significance at < 5% level, **significance at < 1% level, ***significance at < 0.

Phenotypic Validation of the KASP Assays
All 54 KASP assays that produced satisfactory results in the parental polymorphism survey of the eleven genotypes were validated against the phenotypic performance of the F3:F4 and BC3F2:3 progenies. The allelic patterns of the ten genotypes for the 13 KASP assays that showed polymorphism across all the recipient backgrounds and are associated with traits improving germination of rice under deep-sown direct seeded cultivation conditions are presented in Fig. 4. The allelic effects of the 54 KASP assays on the mean phenotypic values of the F3:F4 and BC3F2:3 progenies were assessed using the Kruskal-Wallis test and are described in Table 3. The allelic effects of the KASP assays were significant at P ≤ 0.05 in both the F3:F4 and BC3F2:3 progenies. The 54 phenotypically validated KASP assays include 16 assays for the genomic region associated with % germination and mesocotyl elongation on chromosome 3, 9 assays for the genomic region on chromosome 4, 23 assays for chromosome 7 and 6 assays for chromosome 8 (Table 3).
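To show what the Kruskal-Wallis comparison of the two allele classes involves, the following is a minimal pure-Python sketch for the two-group case (donor-allele lines versus recipient-allele lines). The software used for the test is not stated in the text, so this is an illustration only. The function names are hypothetical, the phenotype values are made up, and the p-value relies on the large-sample chi-square approximation with one degree of freedom.

```python
# Kruskal-Wallis H test for two groups, with tie correction (stdlib only).
# p-value: chi-square approximation with 1 df, P(X > h) = erfc(sqrt(h / 2)).
import math


def rank_with_ties(values):
    """Return average 1-based ranks and the sizes of the tie groups."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of positions i..j, shifted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes


def kruskal_wallis_two_groups(group_a, group_b):
    """Return (H, p) comparing two samples of phenotype values."""
    pooled = list(group_a) + list(group_b)
    n = len(pooled)
    ranks, tie_sizes = rank_with_ties(pooled)
    total, start = 0.0, 0
    for group in (group_a, group_b):
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    correction = 1 - sum(t ** 3 - t for t in tie_sizes) / (n ** 3 - n)
    if correction == 0:
        return 0.0, 1.0  # all values identical: no evidence of a difference
    h = max(h / correction, 0.0)
    return h, math.erfc(math.sqrt(h / 2))


if __name__ == "__main__":
    # Made-up % germination values at 10 cm depth, for illustration only.
    donor_allele = [62.0, 71.5, 80.0, 58.3, 66.7]
    recipient_allele = [20.0, 33.3, 16.7, 25.0, 30.0]
    print(kruskal_wallis_two_groups(donor_allele, recipient_allele))
```

Because the test works on ranks, it makes no normality assumption, which suits bounded traits such as percent germination.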
A few examples of KASP assays on the F3:F4 and BC3F2:3 progenies are presented in Fig. 5. The alleles associated with improved germination and mesocotyl elongation showed significant improvement in germination and a longer mesocotyl under deep-sown direct seeded cultivation conditions. The F3:4 progenies carrying the alleles for improved germination showed 56.37-81.26% germination and 3.65-5.7 cm mesocotyl length, compared with 15.02-36.13% germination and 0.8-1.81 cm mesocotyl length in progenies carrying the reference alleles, when sown 10 cm deep from the soil surface (Table 1). Similarly, the BC3F2:3 progenies carrying the alleles for improved germination showed 54.65-83.3% germination and 3.19-5.84 cm mesocotyl length, compared with 14.26-31.04% germination and 0.5-1.6 cm mesocotyl length in progenies carrying the reference alleles, when sown 10 cm deep from the soil surface (Table 1).

Reliability and Selection of KASP Assays
The designed KASP assays were validated first on the set of 11 parents, followed by a second level of validation on the 15-20 predicted F1 plants developed from each cross considered in the present study. The third validation was carried out on the F3:F4 and BC3F2:3 progenies. Further, the repeatability of the KASP assays was assessed on a random set of 50 samples from the F3:F4 and 80-100 samples from the BC3F2:3 progenies using 10 random markers. Based on the FPR, FNR, KASP utility in different genetic backgrounds and significant differences in the phenotypic values of the positive (desirable) and negative (undesirable) traits, a total of 12 KASP assays, i.e. K_16856978, K_19041692, K_33072076, K_33079643, K_33107252, K_20771274, K_13314239, K_13430534, K_14713452, K_19899233, K_19914183 and K_19914306, were selected. These 12 KASP assays include 5 on chromosome 3, 1 on chromosome 4, 3 on chromosome 7 and 3 on chromosome 8. The R² and p values of the 12 KASP assays, calculated using single marker analysis in WinQTLCart V2.5 (Wang et al. 2012), are presented in Additional file 1: Table S4. A schematic representation of the distribution of the KASP assays associated with seedling vigor traits on chromosomes 3, 4, 7 and 8 of rice is presented in Fig. 6.
Fig. 3 The genetic diversity analysis of the 11 genotypes including 5 donors (Aus344, N22, Kula Karuppan, NCS237, IRGC 128442) and 6 recipient backgrounds (PR121, PR126, PR128, PR129, MTU1010, Pusa Basmati 1509) using the whole genome resequencing data
Fig. 4 The allelic constitution of the 10 genotypes including 5 donors (Aus344, N22, Kula Karuppan, NCS237, IRGC 128442) and 5 recipient backgrounds (PR121, PR126, PR128, PR129, Pusa Basmati 1509) for the 13 KASP assays that showed polymorphism across all the recipient backgrounds associated with traits improving germination of rice under deep sown direct seeded cultivation conditions
Fig. 5 The pictorial representation of the KASP assays conducted on the genotypes, including 5 donors (Aus344, N22, Kula Karuppan, NCS237, IRGC 128442) and 5 recipient backgrounds (PR121, PR126, PR128, PR129, Pusa Basmati 1509), used to develop the breeding panel and of the KASP assays on the breeding panel constituting F3:F4 and BC3F2:3 progenies. PS: polymorphism survey on the 10 genotypes. Blue color indicates the donor allele, red color indicates the recipient allele and green color indicates the heterozygotes

Discussion
The development of molecular markers linked to phenotypically important traits, such as seedling vigor under deep-sown direct seeded cultivation conditions, is of great importance, especially when trait phenotyping is laborious and difficult. In rice, a strong focus on the development of DSR-adapted varieties has led to the introgression of various QTLs/genes providing adaptability to rice under DSR (Menard et al. 2021). Evaluation of below-ground traits is not always straightforward because of various factors including soil, environment, technical/manual error, and difficulties in measuring traits such as mesocotyl elongation. Recent advances in genomics offer several genomics-assisted breeding strategies, such as the use of molecular markers, to overcome these problems. Since the 1980s, breeders have employed various kinds of molecular markers in cereal breeding, such as RAPD (Random Amplified Polymorphic DNA), RFLP (Restriction Fragment Length Polymorphism), AFLP (Amplified Fragment Length Polymorphism), SSR (Simple Sequence Repeats) and STS (Sequence Tagged Sites) markers. The use of molecular markers to improve the efficiency of traditional breeding has been successfully reported for several traits in rice (Jena and Mackill 2008), maize (Prasanna et al. 2010), wheat (Miedaner and Korzun 2012), barley (Miedaner and Korzun 2012), and sorghum (Mohamed et al. 2014; Rooney and Klein 2000).
With the rapid progress in genomics, several initiatives including high-throughput sequencing, identification of genomic regions associated with traits of interest, and transcriptome or RNA sequencing have facilitated the development of functional molecular markers linked with the functional variants governing trait variation. The identification of trait-linked alleles/markers of choice is underway to achieve the targets of modern genomics-assisted breeding programs (Varshney et al. 2005). The development of highly accurate SNP markers has now provided opportunities to target allelic variation improving yield and adaptability of rice under DSR. The whole genome resequencing of 11 genotypes in the present study provides information on about 2.8 million SNPs of different types, which will help breeders mine useful information about these SNPs. The high-throughput and cost-effective whole genome sequencing platform used in the present study to develop the trait-linked KASP assays may help to maximize genetic gains, especially for complex traits under DSR (Semagn et al. 2014; Zhao et al. 2014). Therefore, the identification of core trait-linked significant SNPs is a must in genomics-assisted breeding. To date, diagnostic markers have been reported for abiotic/biotic stress tolerance/resistance, such as rtsv1, Xa4, xa5, xa13, Xa23, Xa21, Xa7 and Sub1A (Lee et al. 2010; Li et al. 2001; Iyer and McCouch 2004; Dilla-Ermita et al. 2017; Chu et al. 2006; Peng et al. 2015; Septiningsih et al. 2009); for root traits improving nutrient uptake under DSR, such as qNR4.1, qNR5.1, qRHD1.1 and qRHD5.1; for grain yield under DSR, such as qGY1.1, qGY6.1 and qGY10.1; for grain yield under reproductive stage drought stress, such as qDTY1.1, qDTY2.1, qDTY3.1 and qDTY12.1 (Sandhu et al. 2022); and for quality traits, such as ALK, Wx, GS3, Pikh, GW5, and CHALK5 (Gao et al. 2003; Bao et al. 2006; Dobo et al. 2010; Teng et al. 2017; Takano-Kai et al. 2009; Yang et al. 2019).
The major challenge in designing the KASP assays was to identify significant SNPs that were specifically linked with the particular donor/trait of interest and polymorphic across multiple recipient backgrounds, so that they could be used further in genomics-assisted breeding programs. Finally, 54 of the 58 successfully designed KASP assays were able to display diversity at the loci. The 54 promising KASP assays reported here, with significant p-values, showed significant association with the relevant phenotypes in the diverse donor/recipient backgrounds and in the recombinant and nearly isogenic breeding population panel, revealing their potential application in DSR breeding programs. The 54 polymorphic KASP assays fulfilled the criteria of quality control, allelic variation between the targeted donor and the recipient backgrounds, and strong association with the key/functional genes associated with traits improving seedling vigor under DSR. To the best of our knowledge, the present study is the first report targeting the development of trait-based KASP assays for traits improving germination and mesocotyl length of rice under deep-sown DSR.
The genotyping results of the F1 plants, the validation of the KASP assays on multiple biparental populations, and the better confidence values with very low false positive and false negative rates demonstrated the high levels of repeatability, accuracy, and robustness of the KASP assays developed in the present study. These results are comparable to the results reported for other panels and genotyping platforms (Simen et al. 2015; Misyura et al. 2016; Thomson et al. 2017; Cai et al. 2017). The mean repeatability of the KASP assays estimated in the present study was about 99%; the 1% dissimilarity between the predicted and the F1 calls could be explained by genotyping errors of the KASP assays. The accuracy and robustness of the KASP assays in calling heterozygous genotypes make them suitable for genotyping segregating populations and marker-assisted backcross populations, and for making genomic predictions in segregating populations.
The selected 12 KASP assays with significant p-values and phenotypic variance (R²) (Additional file 1: Table S4) provide a platform for foreground marker-assisted selection/introgression of traits improving germination of rice under deep-sown DSR conditions. The selected 12 KASP assays may be useful in constructing a set of nearly isogenic lines suitable for deep-sown DSR cultivation, as the identified significant SNPs can be used to select the favourable alleles in a wide range of genetic backgrounds. The set of 12 tightly linked KASP assays can also be used for dissecting linkage drag. The predictive abilities of the selected KASP assays obtained in this study suggest that these assays may be sufficient and cost-effective for screening germplasm possessing traits improving seedling vigor under deep-sown DSR. The detection of haplotypes around the target favourable alleles can further be utilized for fine genetic dissection of the flanking genomic regions.
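To illustrate what the phenotypic variance explained (R²) by a single marker means, the sketch below computes R² as the between-genotype sum of squares divided by the total sum of squares (a one-way ANOVA decomposition). This section does not state how R² was estimated, so the method, the function name marker_r2 and the example values are assumptions made for illustration only.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def marker_r2(genotypes: Sequence[str], phenotypes: Sequence[float]) -> float:
    """Proportion of phenotypic variance explained by genotype class.

    R^2 = SS_between / SS_total, where classes are the marker genotype calls.
    """
    if len(genotypes) != len(phenotypes) or len(phenotypes) == 0:
        raise ValueError("genotypes and phenotypes must be non-empty and of equal length")

    # Group phenotypic values by genotype call.
    groups: Dict[str, List[float]] = defaultdict(list)
    for call, value in zip(genotypes, phenotypes):
        groups[call].append(value)

    grand_mean = sum(phenotypes) / len(phenotypes)
    ss_total = sum((y - grand_mean) ** 2 for y in phenotypes)
    if ss_total == 0:
        return 0.0  # no phenotypic variation to explain

    ss_between = sum(
        len(values) * (sum(values) / len(values) - grand_mean) ** 2
        for values in groups.values()
    )
    return ss_between / ss_total


# Hypothetical mesocotyl lengths (cm) for two genotype classes at one marker.
calls = ["AA", "AA", "GG", "GG"]
lengths = [1.0, 3.0, 5.0, 7.0]
print(round(marker_r2(calls, lengths), 3))
```

A marker whose genotype classes differ strongly in their phenotypic means gives an R² close to 1, whereas a marker with overlapping classes gives a value close to 0.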
We examined how the SNP variants may influence the protein structures. Among the 54 SNPs selected for KASP assay design, we identified 25 located in intergenic regions, 13 within genic regions but situated in the introns of their respective genes, 4 within the untranslated regions (UTRs) of genes, and 7 within the coding regions of genes that produce translated proteins. On further scrutiny, 2 of the 7 SNPs found in the exon regions led to synonymous mutations, while the remaining 5 resulted in missense variants. Further, we used the SIFT (Sorting Intolerant From Tolerant) tool to predict whether the missense variants are likely to affect protein function, based on sequence homology and the physico-chemical similarity between the alternate amino acids. Of the 5 missense variants, 2 had a SIFT score of less than 0.05, indicating a possible deleterious effect of the SNP variants on the protein structure (Additional file 1: Table S3). In future, we plan to conduct functional studies of the 2 genes (LOC_Os04g34290 and LOC_Os08g32100) containing our 2 validated SNPs that cause a possibly deleterious mutation in the protein product.
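The SIFT-based filtering described above can be sketched as follows. The cut-off of 0.05 is the one stated in the text; the variant identifiers and scores in the example are hypothetical placeholders and do not correspond to the values in Additional file 1: Table S3.

```python
from typing import Dict, List

# Cut-off stated in the text: a score below 0.05 suggests a deleterious effect.
SIFT_THRESHOLD = 0.05


def is_predicted_deleterious(sift_score: float, threshold: float = SIFT_THRESHOLD) -> bool:
    """True when the SIFT score falls below the threshold."""
    return sift_score < threshold


def flag_deleterious(scores: Dict[str, float]) -> List[str]:
    """Return the sorted identifiers of variants predicted to be deleterious."""
    return sorted(name for name, score in scores.items() if is_predicted_deleterious(score))


# Hypothetical SIFT scores for five missense variants (placeholder names).
example_scores = {
    "snp_a": 0.01,
    "snp_b": 0.32,
    "snp_c": 0.04,
    "snp_d": 0.05,
    "snp_e": 1.00,
}
print(flag_deleterious(example_scores))
```

Note that a score of exactly 0.05 is treated as tolerated here, following the "less than 0.05" wording used in the text.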
The identified and validated KASP assays associated with seedling emergence would be desirable for marker-assisted introgression, into high-yielding modern cultivars, of traits providing adaptation to rice when sown deep under DSR. The heat maps of the F3:F4 (Fig. 7A) and BC3F2:3 (Fig. 7B) progenies indicate the frequency of the favorable alleles associated with germination of rice when sown deep. Developing genotypes that combine high seedling emergence under deep sowing, tolerance to low oxygen during germination when deeply sown seeds receive unexpected early rains, high vigor, weed competitiveness, and yield potential would likely be a successful strategy for DSR breeding. This will lead to the development of improved cultivars that are well adapted to DSR and to long-term genetic gains.

Fig. 1 Phenotypic evaluation of F3:F4 and BC3F2:3 progenies for different traits improving seedling vigor of rice under control (4 cm) and deep-sown (10 cm) direct-seeded cultivation conditions. A Phenotypic evaluation of parental lines (PR126, one of the recipient parents, and IRGC 128442, one of the donor parents) at 4 cm and 10 cm sowing depth. B Phenotypic variation in mesocotyl length of the PR126 and IRGC 128442 parental lines at 5 DAG (days after germination). C Phenotypic evaluation of F3:F4 and BC3F2:3 progenies under control (4 cm) and deep-sown (10 cm) direct-seeded cultivation conditions in the screen house

Fig. 6 Schematic representation of the distribution of all 58 designed KASP assays associated with seedling vigor traits along the four chromosomes of rice. The alternate SNP IDs (K_ followed by a numeric value) are shown with their genomic position in base pairs, representing the physical position of the SNPs on the chromosome. The numbers below each chromosome indicate chromosome numbers. The four KASP assays in red were non-polymorphic in the parental survey. The remaining 54 KASP assays were polymorphic (blue), and green indicates the 12 most reliable KASP assays (out of the 54 polymorphic KASP assays), including 5 on chromosome 3, 1 on chromosome 4, 3 on chromosome 7 and 3 on chromosome 8

Table 3 (continued). Column headers: KASP; FNR; Total number of genotypes tested; Frequency (%); Phenotypic mean_% germination; Phenotypic mean_ML; Significance level; sub-columns: Negative trait, Positive trait